This vignette assumes a SQL server at localhost (we use PostgreSQL), with data in OMOP Common Data Model v5.4 format in schema cdm_new_york3. The patient records shown in this example are synthetic data from Synthea(TM) Patient Generator.

library(phea)
library(dplyr)

# Connect to SQL server.
dbcon <- DBI::dbConnect(RPostgres::Postgres(),
  host = 'localhost', port = 7654, dbname = 'fort',
  user = cred$pg$user, password = cred$pg$pass)

# Call setup_phea so we can use sqlt() and sql0().
setup_phea(dbcon, 'cdm_new_york3')

In this vignette we identify:

Case A.

Case B.

Case C.

Here’s how we compute it:

In both cases, we will normalize the unit of measurement of serum creatinine to md/dL prior to computing formulas. We could also use formulas for that, and the result would be the same.

Create components

Serum creatinine

We collect SCr records from MEASUREMENT and convert the units to mg/dL.

# Serum creatinine codes used: 
# Loinc 38483-4 Creatinine [Mass/volume] in Blood, OMOP CDM concept ID 3051825
# Loinc 2160-0 Creatinine [Mass/volume] in Serum or Plasma, OMOP CDM concept ID 3016723

# "A" records: Unit is 'mg/dL'.
scr_records_a <- sqlt(measurement) |>
  filter(measurement_concept_id %in% c(3051825, 3016723) &&
    unit_source_value == 'mg/dL')

# "B" records: Unit is 'µmol/L', but we convert to 'mg/dL'.
scr_records_b <- sqlt(measurement) |>
  filter(measurement_concept_id %in% c(3051825, 3016723) &&
    unit_source_value == 'µmol/L') |>
  mutate( # Convert µmol/L to mg/dL
    value_as_number = value_as_number / 88.42,
    unit_source_value = 'mg/dL')

# Combine all available serum creatinine records.
scr_records <- union_all(scr_records_a, scr_records_b)

# Make a record source.
scr_record_source <- make_record_source(scr_records,
  ts = measurement_datetime,
  pid = person_id)

Glomerular filtration rate

We collect GFR records from MEASUREMENT.

# GFR codes used:
# Loinc 77147-7 Glomerular filtration rate/1.73 sq M.predicted [Volume Rate/Area] in Serum, Plasma or Blood by Creatinine-based formula (MDRD)
# OMOP CDM concept ID 46236952 
gfr_records <- sqlt(measurement) |>
  filter(measurement_concept_id == 46236952)

gfr_record_source <- make_record_source(gfr_records,
    ts = measurement_datetime,
    pid = person_id)

Calculate the phenotype

Formula scr_case_a contains the logic of case A (difference ≥0.3 mg/dL), scr_case_b contains case B (ratio ≥1.5), and gfr_case_c contains case C (two GFRs under 60 within 3-5 months).

Notice we use 48 days instead of 48 hours, and 7 months instead of 7 days. This is just to make a visually meaningful example out of the limited data that is produced by Synthea(TM). Those time intervals can be changed to “48 hours” and “7 days” to produce the real phenotype.

scr_change <- calculate_formula(
  components = list(
    # Current SCr
    scr = make_component(scr_record_source),
    
    # Minimum value within 48-hour window
    scr_48h_min = make_component(scr_record_source,
      window = '48 days', .delay_fn = 'min'),
    
    # Minimum value within 7-day window
    scr_7d_min = make_component(scr_record_source,
      window = '7 months', .delay_fn = 'min'),
    
    # Current glomerular filtration rate (GFR)
    gfr = make_component(gfr_record_source),
    
    # Glomerular filtration rate 3 to 5 months older than phenotype date
    gfr_prior = make_component(gfr_record_source,
      delay = '3 months', window = '5 months', .delay_fn = 'min')),
    
  fml = list(
    scr_case_a = 'scr_value_as_number - scr_48h_min_value_as_number >= 0.3',
    
    scr_case_b = 'scr_value_as_number / scr_7d_min_value_as_number >= 1.5',
    
    gfr_case_c = 'gfr_value_as_number < 60 AND gfr_prior_value_as_number < 60'),
  
  export = c(
    'scr_measurement_datetime',
    'scr_48h_min_measurement_datetime',
    'scr_7d_min_measurement_datetime',
    'gfr_measurement_datetime',
    'gfr_prior_measurement_datetime'),
  
  .cascaded = FALSE, # Because we don't need to use results of prior formulas inside other formulas.
)

Let us take a small peek at 15 rows from the phenotype results.

head_shot(scr_change, 15) |>
  kable()
row_id pid ts window scr_value_as_number scr_48h_min_value_as_number scr_7d_min_value_as_number gfr_value_as_number gfr_prior_value_as_number scr_measurement_datetime scr_48h_min_measurement_datetime scr_7d_min_measurement_datetime gfr_measurement_datetime gfr_prior_measurement_datetime scr_case_a scr_case_b gfr_case_c
1 1 2014-03-07 00:00:00 1.4 1.4 1.4 124.5 NA 2014-03-07 2014-03-07 2014-03-07 2014-03-07 NA FALSE FALSE FALSE
2 1 2016-03-11 00:00:00 1.3 1.3 1.3 120.9 NA 2016-03-11 2016-03-11 2016-03-11 2016-03-11 NA FALSE FALSE FALSE
3 1 2018-03-16 00:00:00 1.3 1.3 1.3 151.7 NA 2018-03-16 2018-03-16 2018-03-16 2018-03-16 NA FALSE FALSE FALSE
12 1 2020-03-20 00:00:00 1.3 1.3 1.3 107.3 NA 2020-03-20 2020-03-20 2020-03-20 2020-03-20 NA FALSE FALSE FALSE
5 1 2022-02-11 00:00:00 1.8 1.5 1.5 84.8 NA 2022-02-11 2022-02-11 2022-02-11 2022-02-11 NA TRUE FALSE FALSE
16 1 2022-03-25 42 days 1.5 1.5 1.5 84.1 NA 2022-03-25 2022-02-11 2022-02-11 2022-03-25 NA FALSE FALSE FALSE
17 3 2004-11-26 00:00:00 1.5 1.5 1.5 112.0 NA 2004-11-26 2004-11-26 2004-11-26 2004-11-26 NA FALSE FALSE FALSE
47 3 2005-12-02 00:00:00 1.5 1.5 1.5 114.0 NA 2005-12-02 2005-12-02 2005-12-02 2005-12-02 NA FALSE FALSE FALSE
19 3 2006-12-08 00:00:00 1.5 1.5 1.5 111.9 NA 2006-12-08 2006-12-08 2006-12-08 2006-12-08 NA FALSE FALSE FALSE
49 3 2007-12-14 00:00:00 1.4 1.4 1.4 131.9 NA 2007-12-14 2007-12-14 2007-12-14 2007-12-14 NA FALSE FALSE FALSE
50 3 2008-12-19 00:00:00 1.3 1.3 1.3 147.5 NA 2008-12-19 2008-12-19 2008-12-19 2008-12-19 NA FALSE FALSE FALSE
22 3 2009-12-25 00:00:00 1.3 1.3 1.3 149.6 NA 2009-12-25 2009-12-25 2009-12-25 2009-12-25 NA FALSE FALSE FALSE
23 3 2010-12-31 00:00:00 1.3 1.3 1.3 132.3 NA 2010-12-31 2010-12-31 2010-12-31 2010-12-31 NA FALSE FALSE FALSE
24 3 2011-04-15 105 days 1.4 1.4 1.3 142.0 132.3 2011-04-15 2011-04-15 2010-12-31 2011-04-15 2010-12-31 FALSE FALSE FALSE
54 3 2012-01-06 371 days 1.5 1.5 1.5 91.6 132.3 2012-01-06 2012-01-06 2012-01-06 2012-01-06 2010-12-31 FALSE FALSE FALSE

Plot the phenotype for a random patient

For demonstration purposes, let us pick an “interesting” patient. By “interesting” I just mean a patient who at different times was TRUE and FALSE in each of the three criteria. A patient with variability in their timeline.

# Find an "interesting" case to plot: patients with each criteria at times TRUE, at times FALSE
patients <- scr_change |>
  group_by(pid) |>
  summarise(
    n_03 = n_distinct(scr_case_a),
    n_15 = n_distinct(scr_case_b),
    n_gfr = n_distinct(gfr_case_c)) |>
  mutate(sort_variable = n_03 + n_15 + n_gfr) |>
  arrange(desc(sort_variable)) |>
  head(20) |>
  select(pid) |>
  pull()

random_patient <- sample(patients, 1)

message('Sampled patient: ', random_patient)
#> Sampled patient: 37

Then we plot all data for the chosen patient (pid = 37).

scr_change |>
  select(-ends_with('datetime')) |>
  phea_plot(random_patient)
#> Collecting lazy table, done. (turn this message off with `verbose = FALSE`)

At the end of this report I include a large table with all the data for this patient, directly from the record sources, for maximum verification.

Obtain the SQL query that computes the phenotype

To see the SQL query underlying the phenotype, use helper function code_shot(), or dbplyr::sql_render(), or the .clip_sql option in calculate_formula().

code_shot(scr_change)
SELECT
  "row_id",
  "pid",
  "ts",
  "window",
  "scr_value_as_number",
  "scr_48h_min_value_as_number",
  "scr_7d_min_value_as_number",
  "gfr_value_as_number",
  "gfr_prior_value_as_number",
  "scr_measurement_datetime",
  "scr_48h_min_measurement_datetime",
  "scr_7d_min_measurement_datetime",
  "gfr_measurement_datetime",
  "gfr_prior_measurement_datetime",
  scr_value_as_number - scr_48h_min_value_as_number >= 0.3 AS "scr_case_a",
  scr_value_as_number / scr_7d_min_value_as_number >= 1.5 AS "scr_case_b",
  gfr_value_as_number < 60 AND gfr_prior_value_as_number < 60 AS "gfr_case_c"
FROM (
  SELECT
    *,
    "ts" - least(scr_ts, scr_48h_min_ts, scr_7d_min_ts, gfr_ts, gfr_prior_ts) AS "window",
    last_value(row_id) over (partition by "pid", "ts") AS "ts_row"
  FROM (
    SELECT
      "row_id",
      "pid",
      "ts",
      MAX("scr_measurement_datetime") OVER (PARTITION BY "pid", "..dbplyr_partion_1") AS "scr_measurement_datetime",
      MAX("scr_value_as_number") OVER (PARTITION BY "pid", "..dbplyr_partion_2") AS "scr_value_as_number",
      MAX("scr_ts") OVER (PARTITION BY "pid", "..dbplyr_partion_3") AS "scr_ts",
      MAX("scr_48h_min_measurement_datetime") OVER (PARTITION BY "pid", "..dbplyr_partion_4") AS "scr_48h_min_measurement_datetime",
      MAX("scr_48h_min_value_as_number") OVER (PARTITION BY "pid", "..dbplyr_partion_5") AS "scr_48h_min_value_as_number",
      MAX("scr_48h_min_ts") OVER (PARTITION BY "pid", "..dbplyr_partion_6") AS "scr_48h_min_ts",
      MAX("scr_7d_min_measurement_datetime") OVER (PARTITION BY "pid", "..dbplyr_partion_7") AS "scr_7d_min_measurement_datetime",
      MAX("scr_7d_min_value_as_number") OVER (PARTITION BY "pid", "..dbplyr_partion_8") AS "scr_7d_min_value_as_number",
      MAX("scr_7d_min_ts") OVER (PARTITION BY "pid", "..dbplyr_partion_9") AS "scr_7d_min_ts",
      MAX("gfr_measurement_datetime") OVER (PARTITION BY "pid", "..dbplyr_partion_10") AS "gfr_measurement_datetime",
      MAX("gfr_value_as_number") OVER (PARTITION BY "pid", "..dbplyr_partion_11") AS "gfr_value_as_number",
      MAX("gfr_ts") OVER (PARTITION BY "pid", "..dbplyr_partion_12") AS "gfr_ts",
      MAX("gfr_prior_measurement_datetime") OVER (PARTITION BY "pid", "..dbplyr_partion_13") AS "gfr_prior_measurement_datetime",
      MAX("gfr_prior_value_as_number") OVER (PARTITION BY "pid", "..dbplyr_partion_14") AS "gfr_prior_value_as_number",
      MAX("gfr_prior_ts") OVER (PARTITION BY "pid", "..dbplyr_partion_15") AS "gfr_prior_ts"
    FROM (
      SELECT
        *,
        SUM(CASE WHEN (("scr_measurement_datetime" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_1",
        SUM(CASE WHEN (("scr_value_as_number" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_2",
        SUM(CASE WHEN (("scr_ts" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_3",
        SUM(CASE WHEN (("scr_48h_min_measurement_datetime" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_4",
        SUM(CASE WHEN (("scr_48h_min_value_as_number" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_5",
        SUM(CASE WHEN (("scr_48h_min_ts" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_6",
        SUM(CASE WHEN (("scr_7d_min_measurement_datetime" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_7",
        SUM(CASE WHEN (("scr_7d_min_value_as_number" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_8",
        SUM(CASE WHEN (("scr_7d_min_ts" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_9",
        SUM(CASE WHEN (("gfr_measurement_datetime" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_10",
        SUM(CASE WHEN (("gfr_value_as_number" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_11",
        SUM(CASE WHEN (("gfr_ts" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_12",
        SUM(CASE WHEN (("gfr_prior_measurement_datetime" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_13",
        SUM(CASE WHEN (("gfr_prior_value_as_number" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_14",
        SUM(CASE WHEN (("gfr_prior_ts" IS NULL)) THEN 0 ELSE 1 END) OVER (PARTITION BY "pid" ORDER BY "pid", "ts" ROWS UNBOUNDED PRECEDING) AS "..dbplyr_partion_15"
      FROM (
        SELECT
          row_number() over () AS "row_id",
          "pid",
          "ts",
          last_value(case when "name" = 'dyla2veqz3cf' then "measurement_datetime" else null end) over (partition by "pid", "name" order by "ts" rows between unbounded preceding and 0 preceding) AS "scr_measurement_datetime",
          last_value(case when "name" = 'dyla2veqz3cf' then "value_as_number" else null end) over (partition by "pid", "name" order by "ts" rows between unbounded preceding and 0 preceding) AS "scr_value_as_number",
          last_value(case when "name" = 'dyla2veqz3cf' then "ts" else null end) over (partition by "pid", "name" order by "ts" rows between unbounded preceding and 0 preceding) AS "scr_ts",
          min(case when "name" = 'dyla2veqz3cf' then "measurement_datetime" else null end) over (partition by "pid", "name" order by "ts" range between '48 days'::interval preceding and '0 days'::interval preceding) AS "scr_48h_min_measurement_datetime",
          min(case when "name" = 'dyla2veqz3cf' then "value_as_number" else null end) over (partition by "pid", "name" order by "ts" range between '48 days'::interval preceding and '0 days'::interval preceding) AS "scr_48h_min_value_as_number",
          min(case when "name" = 'dyla2veqz3cf' then "ts" else null end) over (partition by "pid", "name" order by "ts" range between '48 days'::interval preceding and '0 days'::interval preceding) AS "scr_48h_min_ts",
          min(case when "name" = 'dyla2veqz3cf' then "measurement_datetime" else null end) over (partition by "pid", "name" order by "ts" range between '7 months'::interval preceding and '0 days'::interval preceding) AS "scr_7d_min_measurement_datetime",
          min(case when "name" = 'dyla2veqz3cf' then "value_as_number" else null end) over (partition by "pid", "name" order by "ts" range between '7 months'::interval preceding and '0 days'::interval preceding) AS "scr_7d_min_value_as_number",
          min(case when "name" = 'dyla2veqz3cf' then "ts" else null end) over (partition by "pid", "name" order by "ts" range between '7 months'::interval preceding and '0 days'::interval preceding) AS "scr_7d_min_ts",
          last_value(case when "name" = 'zyap84s7wcid' then "measurement_datetime" else null end) over (partition by "pid", "name" order by "ts" rows between unbounded preceding and 0 preceding) AS "gfr_measurement_datetime",
          last_value(case when "name" = 'zyap84s7wcid' then "value_as_number" else null end) over (partition by "pid", "name" order by "ts" rows between unbounded preceding and 0 preceding) AS "gfr_value_as_number",
          last_value(case when "name" = 'zyap84s7wcid' then "ts" else null end) over (partition by "pid", "name" order by "ts" rows between unbounded preceding and 0 preceding) AS "gfr_ts",
          min(case when "name" = 'zyap84s7wcid' then "measurement_datetime" else null end) over (partition by "pid", "name" order by "ts" range between '5 months'::interval preceding and '3 months'::interval preceding) AS "gfr_prior_measurement_datetime",
          min(case when "name" = 'zyap84s7wcid' then "value_as_number" else null end) over (partition by "pid", "name" order by "ts" range between '5 months'::interval preceding and '3 months'::interval preceding) AS "gfr_prior_value_as_number",
          min(case when "name" = 'zyap84s7wcid' then "ts" else null end) over (partition by "pid", "name" order by "ts" range between '5 months'::interval preceding and '3 months'::interval preceding) AS "gfr_prior_ts"
        FROM (
          (
            SELECT
              'dyla2veqz3cf' AS "name",
              "person_id" AS "pid",
              "measurement_datetime" AS "ts",
              "measurement_datetime",
              "value_as_number"
            FROM (
              (
                SELECT *
                FROM "cdm_new_york3"."measurement"
                WHERE ("measurement_concept_id" IN (3051825.0, 3016723.0) AND "unit_source_value" = 'mg/dL')
              )
              UNION ALL
              (
                SELECT
                  "measurement_id",
                  "person_id",
                  "measurement_concept_id",
                  "measurement_date",
                  "measurement_datetime",
                  "measurement_time",
                  "measurement_type_concept_id",
                  "operator_concept_id",
                  "value_as_number" / 88.42 AS "value_as_number",
                  "value_as_concept_id",
                  "unit_concept_id",
                  "range_low",
                  "range_high",
                  "provider_id",
                  "visit_occurrence_id",
                  "visit_detail_id",
                  "measurement_source_value",
                  "measurement_source_concept_id",
                  'mg/dL' AS "unit_source_value",
                  "unit_source_concept_id",
                  "value_source_value",
                  "measurement_event_id",
                  "meas_event_field_concept_id"
                FROM "cdm_new_york3"."measurement"
                WHERE ("measurement_concept_id" IN (3051825.0, 3016723.0) AND "unit_source_value" = 'µmol/L')
              )
            ) "q01"
          )
          UNION ALL
          (
            SELECT
              'zyap84s7wcid' AS "name",
              "person_id" AS "pid",
              "measurement_datetime" AS "ts",
              "measurement_datetime",
              "value_as_number"
            FROM "cdm_new_york3"."measurement"
            WHERE ("measurement_concept_id" = 46236952.0)
          )
        ) "q02"
      ) "q03"
    ) "q04"
  ) "q05"
) "q06"
WHERE ("row_id" = "ts_row")

See the raw data that went into calculate_formula()

Sometimes the best way to check a result is to manually, painstakingly go over the data. Below I print all rows from the record sources for the patient that was plotted above.

Serum creatinine records

scr_records |>
  filter(person_id == random_patient) |>
  collect() |>
  arrange(measurement_datetime) |>
  select(measurement_id, person_id, measurement_datetime, value_as_number, unit_source_value) |>
  kable()
measurement_id person_id measurement_datetime value_as_number unit_source_value
14091 37 2003-01-17 7.5 mg/dL
14642 37 2003-05-23 4.4 mg/dL
14597 37 2003-10-17 4.6 mg/dL
12685 37 2004-01-23 5.1 mg/dL
13779 37 2004-08-13 6.0 mg/dL
13895 37 2004-10-15 4.2 mg/dL
14345 37 2005-01-07 5.3 mg/dL
13214 37 2005-01-28 4.7 mg/dL
13309 37 2005-05-13 4.4 mg/dL
13116 37 2005-11-04 4.9 mg/dL
12960 37 2006-01-06 5.7 mg/dL
13097 37 2006-02-03 6.1 mg/dL
13179 37 2006-05-05 4.2 mg/dL
14403 37 2006-09-01 5.1 mg/dL
14709 37 2007-02-02 4.4 mg/dL
13275 37 2007-02-16 4.9 mg/dL
14299 37 2007-04-27 4.3 mg/dL
13967 37 2007-11-23 3.8 mg/dL
13000 37 2008-02-22 4.6 mg/dL
13941 37 2008-06-20 4.6 mg/dL
13707 37 2008-10-24 3.8 mg/dL
13366 37 2009-02-27 6.2 mg/dL
13075 37 2009-04-17 4.2 mg/dL
13731 37 2009-07-17 4.0 mg/dL
13594 37 2009-11-13 5.8 mg/dL
14319 37 2010-03-05 3.3 mg/dL
14326 37 2010-03-05 6.1 mg/dL
14132 37 2010-03-19 4.7 mg/dL
13618 37 2010-06-11 6.0 mg/dL
14461 37 2010-10-08 4.0 mg/dL
13235 37 2010-12-10 4.1 mg/dL
13562 37 2011-03-11 5.6 mg/dL
13390 37 2011-03-26 2.9 mg/dL
14528 37 2011-04-08 5.1 mg/dL
12949 37 2011-09-09 4.5 mg/dL
12842 37 2011-10-07 4.9 mg/dL
12702 37 2012-03-16 5.4 mg/dL
14444 37 2012-03-25 2.9 mg/dL
14034 37 2012-06-01 3.6 mg/dL
13149 37 2012-11-30 4.8 mg/dL
13864 37 2013-03-01 4.4 mg/dL
14532 37 2013-03-22 6.1 mg/dL
13349 37 2013-03-25 3.5 mg/dL
14649 37 2013-09-27 6.2 mg/dL
13800 37 2014-03-25 3.3 mg/dL
13724 37 2014-03-28 4.0 mg/dL
12855 37 2014-04-25 4.4 mg/dL
12779 37 2014-08-22 5.9 mg/dL
12731 37 2014-10-17 3.9 mg/dL
13515 37 2014-12-19 4.6 mg/dL
14241 37 2015-03-25 2.7 mg/dL
14612 37 2015-04-03 3.9 mg/dL
14217 37 2015-11-13 5.3 mg/dL
14009 37 2016-03-24 2.6 mg/dL
12797 37 2016-04-08 6.5 mg/dL
13972 37 2016-06-10 5.6 mg/dL
13159 37 2016-12-09 3.7 mg/dL
13460 37 2017-03-24 3.0 mg/dL
12902 37 2017-04-14 3.7 mg/dL
13428 37 2017-06-09 5.3 mg/dL
14161 37 2017-09-01 4.2 mg/dL
13249 37 2018-01-05 6.1 mg/dL
14192 37 2018-03-02 3.6 mg/dL
12756 37 2018-03-24 3.5 mg/dL
13842 37 2018-04-20 4.6 mg/dL
12919 37 2018-06-01 4.1 mg/dL
14267 37 2019-03-24 2.9 mg/dL
14576 37 2019-04-26 3.6 mg/dL
14176 37 2019-05-24 3.7 mg/dL
13910 37 2019-07-26 5.7 mg/dL
13498 37 2019-10-25 3.5 mg/dL
13672 37 2020-03-23 2.7 mg/dL
13414 37 2020-04-24 4.0 mg/dL
13320 37 2020-05-08 4.5 mg/dL
13011 37 2020-05-22 4.8 mg/dL
14033 37 2020-07-17 3.8 mg/dL
14668 37 2020-12-18 3.5 mg/dL
14418 37 2021-02-12 5.3 mg/dL
14118 37 2021-03-23 2.6 mg/dL
14503 37 2021-05-14 4.9 mg/dL
13633 37 2021-05-28 3.7 mg/dL
13475 37 2021-06-18 3.5 mg/dL
13034 37 2021-08-13 3.8 mg/dL
13686 37 2021-09-10 3.4 mg/dL
14066 37 2022-03-23 3.3 mg/dL
12671 37 2022-05-13 5.9 mg/dL
13828 37 2022-05-27 3.4 mg/dL

Glomerular filtration rate records

gfr_records |>
  filter(person_id == random_patient) |>
  collect() |>
  arrange(measurement_datetime) |>
  select(measurement_id, person_id, measurement_datetime, value_as_number, unit_source_value) |>
  kable()
measurement_id person_id measurement_datetime value_as_number unit_source_value
14080 37 2003-01-17 16.5 mL/min/{1.73_m2}
14633 37 2003-05-23 27.2 mL/min/{1.73_m2}
14564 37 2003-10-17 25.1 mL/min/{1.73_m2}
12655 37 2004-01-23 22.3 mL/min/{1.73_m2}
13803 37 2004-08-13 19.2 mL/min/{1.73_m2}
13898 37 2004-10-15 27.0 mL/min/{1.73_m2}
14340 37 2005-01-07 21.4 mL/min/{1.73_m2}
13216 37 2005-01-28 24.0 mL/min/{1.73_m2}
13298 37 2005-05-13 25.9 mL/min/{1.73_m2}
13105 37 2005-11-04 23.1 mL/min/{1.73_m2}
12965 37 2006-01-06 19.8 mL/min/{1.73_m2}
13080 37 2006-02-03 18.2 mL/min/{1.73_m2}
13181 37 2006-05-05 26.8 mL/min/{1.73_m2}
14392 37 2006-09-01 21.8 mL/min/{1.73_m2}
14702 37 2007-02-02 25.4 mL/min/{1.73_m2}
13285 37 2007-02-16 22.9 mL/min/{1.73_m2}
14285 37 2007-04-27 26.1 mL/min/{1.73_m2}
13956 37 2007-11-23 29.0 mL/min/{1.73_m2}
12996 37 2008-02-22 24.1 mL/min/{1.73_m2}
13929 37 2008-06-20 23.7 mL/min/{1.73_m2}
13712 37 2008-10-24 28.7 mL/min/{1.73_m2}
13354 37 2009-02-27 17.6 mL/min/{1.73_m2}
13064 37 2009-04-17 25.9 mL/min/{1.73_m2}
13762 37 2009-07-17 27.1 mL/min/{1.73_m2}
13629 37 2009-11-13 18.6 mL/min/{1.73_m2}
14325 37 2010-03-05 81.1 mL/min
14337 37 2010-03-05 17.6 mL/min/{1.73_m2}
14137 37 2010-03-19 22.7 mL/min/{1.73_m2}
13607 37 2010-06-11 17.8 mL/min/{1.73_m2}
14471 37 2010-10-08 26.8 mL/min/{1.73_m2}
13221 37 2010-12-10 25.7 mL/min/{1.73_m2}
13573 37 2011-03-11 19.0 mL/min/{1.73_m2}
13376 37 2011-03-26 72.8 mL/min
14514 37 2011-04-08 20.7 mL/min/{1.73_m2}
12938 37 2011-09-09 23.3 mL/min/{1.73_m2}
12831 37 2011-10-07 21.5 mL/min/{1.73_m2}
12720 37 2012-03-16 19.4 mL/min/{1.73_m2}
14435 37 2012-03-25 83.8 mL/min
14014 37 2012-06-01 28.9 mL/min/{1.73_m2}
13137 37 2012-11-30 21.9 mL/min/{1.73_m2}
13875 37 2013-03-01 23.9 mL/min/{1.73_m2}
14552 37 2013-03-22 17.0 mL/min/{1.73_m2}
13352 37 2013-03-25 65.2 mL/min
14661 37 2013-09-27 16.7 mL/min/{1.73_m2}
13750 37 2014-03-25 88.5 mL/min
13758 37 2014-03-28 25.8 mL/min/{1.73_m2}
12844 37 2014-04-25 23.4 mL/min/{1.73_m2}
12766 37 2014-08-22 17.5 mL/min/{1.73_m2}
12734 37 2014-10-17 26.1 mL/min/{1.73_m2}
13508 37 2014-12-19 21.9 mL/min/{1.73_m2}
14256 37 2015-03-25 77.9 mL/min
14601 37 2015-04-03 25.9 mL/min/{1.73_m2}
14237 37 2015-11-13 18.9 mL/min/{1.73_m2}
14003 37 2016-03-24 72.3 mL/min
12813 37 2016-04-08 15.5 mL/min/{1.73_m2}
13980 37 2016-06-10 17.9 mL/min/{1.73_m2}
13128 37 2016-12-09 27.2 mL/min/{1.73_m2}
13454 37 2017-03-24 66.8 mL/min
12891 37 2017-04-14 26.6 mL/min/{1.73_m2}
13439 37 2017-06-09 18.6 mL/min/{1.73_m2}
14153 37 2017-09-01 23.7 mL/min/{1.73_m2}
13238 37 2018-01-05 16.2 mL/min/{1.73_m2}
14195 37 2018-03-02 27.6 mL/min/{1.73_m2}
12752 37 2018-03-24 82.0 mL/min
13853 37 2018-04-20 21.6 mL/min/{1.73_m2}
12928 37 2018-06-01 24.1 mL/min/{1.73_m2}
14261 37 2019-03-24 72.9 mL/min
14587 37 2019-04-26 27.0 mL/min/{1.73_m2}
14187 37 2019-05-24 26.3 mL/min/{1.73_m2}
13922 37 2019-07-26 17.2 mL/min/{1.73_m2}
13486 37 2019-10-25 27.5 mL/min/{1.73_m2}
13666 37 2020-03-23 73.9 mL/min
13419 37 2020-04-24 23.7 mL/min/{1.73_m2}
13331 37 2020-05-08 21.1 mL/min/{1.73_m2}
13022 37 2020-05-22 20.2 mL/min/{1.73_m2}
14057 37 2020-07-17 25.6 mL/min/{1.73_m2}
14682 37 2020-12-18 27.2 mL/min/{1.73_m2}
14428 37 2021-02-12 17.9 mL/min/{1.73_m2}
14113 37 2021-03-23 77.1 mL/min
14484 37 2021-05-14 19.3 mL/min/{1.73_m2}
13646 37 2021-05-28 25.9 mL/min/{1.73_m2}
13464 37 2021-06-18 26.8 mL/min/{1.73_m2}
13037 37 2021-08-13 25.3 mL/min/{1.73_m2}
13675 37 2021-09-10 27.4 mL/min/{1.73_m2}
14077 37 2022-03-23 69.5 mL/min
12661 37 2022-05-13 16.0 mL/min/{1.73_m2}
13830 37 2022-05-27 27.3 mL/min/{1.73_m2}

Produce events for Atlas

One approach to use Phea’s results inside Atlas is to produce novel records (e.g. new rows in OBSERVATION or MEASUREMENT) using custom concept IDs, then ETL those back into the dataset.

Below I illustrate how to do it with case A, assuming the new records will go into MEASUREMENT. Notice I put the difference inside value_as_number, just in case.

case_a_custom_concept_id <- 2000000001
# Case A: SCr increase by >=0.3 over 48 hours
data_for_etl <- scr_change |>
  filter(scr_case_a) |> # Keep only rows where case A was TRUE
  transmute(
    measurement_datetime = ts,
    person_id = pid,
    value_as_number = scr_value_as_number - scr_48h_min_value_as_number,
    measurement_concept_id = case_a_custom_concept_id)

head_shot(data_for_etl) |>
  kable()
measurement_datetime person_id value_as_number measurement_concept_id
2022-02-11 1 0.3 2000000001
2016-01-29 3 0.5 2000000001
2016-11-11 3 1.7 2000000001
2016-11-27 19 1.6 2000000001
2018-12-16 19 0.8 2000000001
2019-11-03 19 1.0 2000000001
2020-11-22 19 0.6 2000000001
2021-11-14 19 0.8 2000000001
2021-02-21 30 0.4 2000000001
2021-02-23 30 0.3 2000000001

Author contact

Fabrício Kury – Please be always welcome to reach me at .